Nachos would be an excellent product offering for Chipotle and should be sold in all regions.
Nachos can help grow sales and attract new customers to Chipotle. Historical data provides strong proof-of-concept, showing that nachos can add to overall sales as an additional menu offering. Accross all restaurants during the sample period, gross nacho sales were approximately $45k, slightly behind sales of quesadillas and slightly ahead of salads.
Few additional costs are required to add nachos to the menu. Nachos’ key ingredients are simple remixes of ingredients are already available at the service counter (lettuce, cheese, etc.). By adding such a popular menu item to its strong existing offerings, Chipotle has an opportunity to enhance its brand value among current customers and help bring in new customers through our doors.
A deep dive into optimal pricing can help set the best price to maximize revenue. The dataset provided contains a range of index prices and gross sales amounts. Subsequent investigations could use these data points as initial estimates for a target price level based on supplementary factors such as local sourcing costs. Note, however, that nachos’ similarity to Chipotle’s existing menu offerings may cause some customers simply to switch between products, substituting nachos for another item they would otherwise normally order. Further research could help shed light on this issue.
This notebook analyzes transaction-level data for Chipotle restaurants to help decide whether to roll out nachos as a menu offering across all Chipotle locations.
Basic data pre-processing steps were applied to the dataset, which reduced the overall number of records. The dataset started with roughly 912k records. Around 140k were removed to restrict the analysis to in-store, non-gift-card orders that were categorized as either dine-in, take-out, or waste. Two restaurants that did not offer nachos (locations 2403 and 6004) were also excluded from the analysis, as were a small number of outliers (e.g., order quantity greater than 1). The final number of records in the dataset was 561833.
Some inconsistencies were found in the data compared to descriptions provided in the assignment. The assignment states that Chipotle introduced Nachos to their menu in 2 markets on 10/1/2018. However, virtually all of the records provided contain nacho sales, and the dataset gives no indication of restaurant location or region. Additionally, individual transaction records start as early as September and continue into December, although the assignment states that the nacho campaign only ran until November. Ongoing dialogue and collaboration between the analytics research team and stakeholders around the business will help clarify these issues.
This section explains the data processing steps.
Loaded comma-separated file into memory (n=918412).
n=142105)2403 and 6004) had no nacho sales data at alln=5963 and n=8893, respectively)file_path <- "/Users/mj/Documents/Jobs/Chipotle/Assignment/nachos_sample_interviewees.csv"
rawdata <- read_csv(file_path) %>% arrange(RestaurantNumber, CheckDate)
filter1 <-
read_csv(file_path) %>%
filter(FulfillmentTypeName != 'In Store' |
OrderModeName %in% c('DINE IN', 'TAKE OUT', 'WASTE') == FALSE |
str_detect(ItemName, "Gift Card*") == TRUE)
df <-
read_csv(file_path) %>%
# sort
arrange(
RestaurantNumber,
CheckDate
) %>%
# filter/subset
filter(
FulfillmentTypeName == 'In Store' &
OrderModeName %in% c('DINE IN', 'TAKE OUT', 'WASTE') &
str_detect(ItemName, "Gift Card") == FALSE
) %>%
# modify
mutate(
Month = month.abb[lubridate::month(CheckDate)],
Week = dense_rank(lubridate::week(CheckDate)),
WeekDay = weekdays(CheckDate),
OrderModeName = as.factor(OrderModeName),
Item = str_split(ItemName, '-', simplify = TRUE)[,1],
Item = case_when(
str_detect(Item, "Kid's*") == TRUE ~ "Kid's",
str_detect(Item, "Extra*") == TRUE ~ "Extra",
str_detect(Item, "Taco*") == TRUE ~ "Tacos",
str_detect(Item, "Quesadilla*") == TRUE ~ "Quesadilla",
str_detect(Item, "Side*") == TRUE ~ "Side",
Item == "Boorito" ~ "Burrito",
TRUE ~ Item),
SaleType = if_else(str_detect(ItemName, 'Nacho*'), "Nachos", "Other Item")
) %>%
# drop extra columns
select(
-SalesChannelName,
-SalesChannelType,
-DigitalChannelName,
-FulfillmentTypeName,
-ItemID,
-ItemName
)
# Show that we only have two restaurants that don't sell nachos
tbl <- df %>%
group_by(RestaurantNumber) %>%
mutate(
StoreType = case_when(
n_distinct(SaleType) == 1 ~ "Doesn't Sell Nachos",
n_distinct(SaleType) == 2 ~ "Sells Nachos",
TRUE ~ NA_character_)) %>%
distinct(RestaurantNumber, StoreType) %>%
inner_join(
df %>%
group_by(RestaurantNumber, WeekDay) %>%
summarise(
GrossSales = sum(GrossSalesAmount),
AverageSale = mean(GrossSalesAmount)
))
print("Here are total and average sales per day-of-week by restaurant.")
[1] "Here are total and average sales per day-of-week by restaurant."
tbl
print("These restaurants did not sell nachos during this period.")
[1] "These restaurants did not sell nachos during this period."
nonacho <- df %>% filter(RestaurantNumber %in% c(2403, 6004))
nonacho
print("These are dropped records: either they belong to restaurants that did not sell nachos, have quantity greater than 1, or $0 gross sales amount.")
[1] "These are dropped records: either they belong to restaurants that did not sell nachos, have quantity greater than 1, or $0 gross sales amount."
filter2 <-
df %>% # Note we read the filtered dataframe from above, not the raw data
filter(Quantity > 1 | RestaurantNumber %in% c(2403, 6004) == TRUE | GrossSalesAmount == 0)
filter2
# Do final dataset modifications
df <- df %>% filter(
Quantity <= 1 &
RestaurantNumber %in% c(2403, 6004) == FALSE &
GrossSalesAmount != 0
)
# Get record counts
n0 <- dim(rawdata)[1]
n1 <- dim(filter1)[1]
n2 <- dim(filter2)[1]
n3 <- dim(df)[1]
n4 <- n1
# Verify counts match
# n0 == n1 + n2 + n3
print("This is the dataset used for analysis.")
[1] "This is the dataset used for analysis."
df
This section visualizes item price, total sales and units sold by item during the nacho campaign.
# Counts of items by type
df %>%
select(Item, SaleType) %>%
# gather(key, value, -SaleType) %>%
ggplot() +
geom_bar(
mapping = aes(
col = SaleType,
fct_relevel(fct_rev(fct_infreq(fct_lump(Item, n = 15))), "Other"))
) +
# facet_wrap(~key, scales = "free") +
xlab(label = "") +
theme_classic() +
coord_flip() +
ggtitle("Number Sold")
# total gross sales by item
tot <- df %>%
group_by(Item, SaleType) %>%
summarise(GrossSales = sum(GrossSalesAmount, na.rm = TRUE)) %>%
arrange(desc(GrossSales))
# take top 15
tot[1:15,] %>%
ggplot() +
geom_col(
aes(
x = reorder(Item, GrossSales),
y = GrossSales,
col = SaleType)) +
scale_y_continuous(labels = dollar_format()) +
xlab(label = "") +
coord_flip() +
theme_classic() +
ggtitle("Total Sales")
# use this snippet for labeling values
# +
# geom_label(
# aes(
# label = GrossSales,
# x = reorder(Item, GrossSales),
# y = GrossSales),
# size = 3)
# average price by item
avg <- df %>%
group_by(Item, SaleType) %>%
summarise(AveragePrice = mean(IndexPrice, na.rm = TRUE)) %>%
arrange(desc(AveragePrice))
# take top 15
avg[1:15,] %>%
ggplot() +
geom_col(
aes(
x = reorder(Item, AveragePrice),
y = AveragePrice,
col = SaleType)) +
scale_y_continuous(labels = dollar_format()) +
xlab(label = "") +
coord_flip() +
theme_classic() +
ggtitle("Average Price")